JPEG SYSTEM ARCHITECTURE 1-DEC-92
This file provides an overview of the "architecture" of the portable JPEG
software; that is, the functions of the various modules in the system and the
interfaces between modules. For more precise details about any data structure
or calling convention, see the header files.
Important note: when I say "module" I don't mean "a C function", which is what
some people seem to think the term means. A separate C source file is closer
to the mark. Also, it is frequently the case that several different modules
present a common interface to callers; the term "object" or "method" refers to
this common interface (see "Poor man's object-oriented programming", below).
JPEG-specific terminology follows the JPEG standard:
A "component" means a color channel, e.g., Red or Luminance.
A "sample" is a pixel component value (i.e., one number in the image data).
A "coefficient" is a frequency coefficient (a DCT transform output number).
The term "block" refers to an 8x8 group of samples or coefficients.
"MCU" (minimum coded unit) is the same as "MDU" of the R8 draft; i.e., an
interleaved set of blocks of size determined by the sampling factors,
or a single block in a noninterleaved scan.
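In C terms, this terminology maps onto types along the following lines. (The
names here are only illustrative; the header files give the authoritative
definitions, and the data precision is configurable.)

    /* A sample: one component value of one pixel (8-bit data assumed). */
    typedef unsigned char JSAMPLE;

    /* A coefficient: one DCT frequency-domain value. */
    typedef short JCOEF;

    /* A block: an 8x8 group of samples or coefficients. */
    #define DCTSIZE  8              /* blocks are DCTSIZE x DCTSIZE */
    typedef JCOEF JBLOCK[DCTSIZE * DCTSIZE];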
*** System requirements ***
We must support compression and decompression of both Huffman and
arithmetic-coded JPEG files. Any set of compression parameters allowed by the
JPEG spec should be readable for decompression. (We can be more restrictive
about what formats we can generate.) (Note: for legal reasons no arithmetic
coding implementation is currently included in the publicly available sources.
However, the architecture still supports it.)
We need to be able to handle both raw JPEG files (more specifically, the JFIF
format) and JPEG-in-TIFF (C-cubed's format, and perhaps Kodak's). Even if we
don't implement TIFF ourselves, other people will want to use our code for
that. This means that generation and scanning of the file header has to be
separated out.
Perhaps we should be prepared to support the JPEG lossless mode (also referred
to in the spec as spatial DPCM coding). A lot of people seem to believe they
need this... whether they really do is debatable, but the customer is always
right. On the other hand, there will not be much sharable code between the
lossless and lossy modes! At best, a lossless program could be derived from
parts of the lossy version. For now we will only worry about the lossy mode.
I see no real value in supporting the JPEG progressive modes (note that
spectral selection and successive approximation are two different progressive
modes). These are only of interest when painting the decompressed image in
real-time, which nobody is going to do with a pure software implementation.
There is some value in supporting the hierarchical mode, which allows for
successive frames of higher resolution. This could be of use for including
"thumbnail" representations. However, this appears to add a lot more
complexity than it is worth.
A variety of uncompressed image file formats and user interfaces must be
supported. These aspects therefore have to be kept separate from the rest of
the system. A particularly important issue is whether color quantization of
the output is needed (i.e., whether a colormap is used). We should be able to
support both adaptive quantization (which requires two or more passes over the
image) and nonadaptive (quantization to a prespecified colormap, which can be
done in one pass).
Memory usage is an important concern, since we will port this code to 80x86
and other limited-memory machines. For large intermediate structures, we
should be able to use either virtual memory or temporary files.
It should be possible to build programs that handle compression only,
decompression only, or both, without much duplicate or unused code in any
version. (In particular, a decompression-only version should have no extra
baggage.)
*** Compression overview ***
The *logical* steps needed in (non-lossless) JPEG compression are:
1. Conversion from incoming image format to a standardized internal form
(either RGB or grayscale).
2. Color space conversion (e.g., RGB to YCbCr). This is a null step for
grayscale (unless we support mapping color inputs to grayscale, which
would most easily be done here). Gamma adjustment may also be needed here.
3. Downsampling (reduction of number of samples in some color components).
This step operates independently on each color component.
4. MCU extraction (creation of a single sequence of 8x8 sample blocks).
This step and the following ones are performed once for each scan
in the output JPEG file, i.e., once if making an interleaved file and more
than once for a noninterleaved file.
Note: both this step and the previous one must deal with edge conditions
for pictures that aren't a multiple of the MCU dimensions. Alternatively,
we could expand the picture to a multiple of the MCU dimensions before
doing these two steps. (The latter seems better and has been adopted below.)
5. DCT transformation of each 8x8 block.
6. Quantization scaling and zigzag reordering of the elements in each 8x8
   block (sketched in code after this list).
7. Huffman or arithmetic encoding of the transformed block sequence.
8. Output of the JPEG file with whatever headers/markers are wanted.
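As a concrete illustration, step 6 for a single block might look like the
sketch below. The zigzag table is the standard JPEG scan order; the plain
integer division is the simplest possible quantization, and a real
implementation would round rather than truncate, and would fold the scaling
into the DCT step for speed.

    /* Natural (row-major) position of the i'th coefficient in zigzag order. */
    static const int zigzag[64] = {
       0,  1,  8, 16,  9,  2,  3, 10,
      17, 24, 32, 25, 18, 11,  4,  5,
      12, 19, 26, 33, 40, 48, 41, 34,
      27, 20, 13,  6,  7, 14, 21, 28,
      35, 42, 49, 56, 57, 50, 43, 36,
      29, 22, 15, 23, 30, 37, 44, 51,
      58, 59, 52, 45, 38, 31, 39, 46,
      53, 60, 61, 54, 47, 55, 62, 63
    };

    /* Divide one block of DCT outputs by the quantization table and emit
     * the results in zigzag order.
     */
    void quantize_and_reorder(const JCOEF dct_out[64],
                              const unsigned short qtable[64],
                              JCOEF block_out[64])
    {
      int i;
      for (i = 0; i < 64; i++) {
        int n = zigzag[i];
        block_out[i] = (JCOEF) (dct_out[n] / qtable[n]);
      }
    }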
Of course, the actual implementation will combine some of these logical steps
for efficiency. The trick is to keep these logical functions as separate as
possible without losing too much performance.
In addition to these logical pipeline steps, we need various modules that
aren't part of the data pipeline. These are:
A. Overall control (sequencing of other steps & management of data passing).
B. User interface; this will determine the input and output files, and supply
values for some compression parameters. Note that this module is highly
platform-dependent.
C. Compression parameter selection: some parameters should be chosen
automatically rather than requiring the user to find a good value.
The prototype only does this for the back-end (Huffman or arithmetic)
parameters, but more might be done in the future. A straightforward
approach to selection is to try several values; this requires being able
to repeatedly apply some portion of the pipeline and inspect the results
(without actually outputting them). Probably only entropy encoding
parameters can reasonably be selected this way; optimizing
earlier steps would require too much data to be reprocessed (not to mention
the problem of interactions between parameters for different steps).
What other facilities do we need to support automatic parameter selection?
D. A memory management module to deal with small-memory machines. This must
create the illusion of virtual memory for certain large data structures
(e.g., the downsampled image or the transformed coefficients).
The interface to this must be defined to minimize the overhead incurred,
especially on virtual-memory machines where the module won't do much.
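For instance, the illusion might be presented as "virtual arrays" that
clients access one strip of rows at a time, leaving the memory manager free
to keep the backing store in real memory, OS virtual memory, or a temporary
file. (This is only a sketch with hypothetical names; the real interface is
whatever the memory manager's header declares.)

    /* A virtual array of image data, possibly backed by a temp file. */
    typedef struct virt_array * virt_array_ptr;

    /* Create a virtual array of 'rows' rows of 'rowwidth' samples each. */
    virt_array_ptr request_virt_array(long rows, long rowwidth);

    /* Obtain a pointer to rows [start, start+numrows) for reading or
     * writing.  The manager may page other strips out to the temp file
     * to honor the request; the pointer is valid only until the next
     * access call.
     */
    JSAMPLE ** access_virt_array(virt_array_ptr arr, long start, long numrows);

    /* Release the array and any temporary file behind it. */
    void free_virt_array(virt_array_ptr arr);

On a machine with plenty of (virtual) memory, request_virt_array can simply
allocate the whole array and access_virt_array reduces to pointer
arithmetic, which is what keeps the overhead low where the module isn't
really needed.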
In many cases we can arrange things so that a data stream is produced in
segments by one module and consumed by another without the need to hold it all
in (virtual) memory. This is obviously not possible for any data that must be
scanned more than once, so it won't work everywhere.
The major variable at this level of detail is whether the JPEG file is to be
interleaved or not; that affects the order of processing so fundamentally that
the central control module must know about it. Some of the other modules may
need to know it too. It would simplify life if we didn't need to support
noninterleaved images, but that is not reasonable.
Many of these steps operate independently on each color component; the
knowledge of how many components there are, and how they are interleaved,
ought to be confined to the central control module. (Color space conversion
and MCU extraction probably have to know it too.)
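To make the bookkeeping concrete: in an interleaved scan each component
contributes h x v blocks to every MCU, where h and v are its sampling
factors, so the control module can size MCUs along these lines (a sketch
with hypothetical names):

    /* Per-component sampling factors; e.g. h=2,v=2 for Y and h=1,v=1 for
     * Cb and Cr in a typical 2x2-downsampled color image.
     */
    struct component_info { int h_samp_factor, v_samp_factor; };

    /* Number of 8x8 blocks in each MCU of an interleaved scan.
     * (A noninterleaved scan always has exactly one block per MCU.)
     */
    int blocks_per_mcu(const struct component_info *comps, int num_comps)
    {
      int total = 0, ci;
      for (ci = 0; ci < num_comps; ci++)
        total += comps[ci].h_samp_factor * comps[ci].v_samp_factor;
      return total;   /* e.g. 2*2 + 1*1 + 1*1 = 6 for 2x2 YCbCr */
    }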
*** Decompression overview ***
Decompression is roughly the inverse process from compression, but there are
some additional steps needed to produce a good output image.
The *logical* steps needed in (non-lossless) JPEG decompression are:
1. Scanning of the JPEG file, decoding of headers/markers etc.
2. Huffman or arithmetic decoding of the coefficient sequence.
3. Quantization descaling and zigzag reordering of the elements in each 8x8
block.
4. MCU disassembly (conversion of a possibly interleaved sequence of 8x8
blocks back to separate components in pixel map order).
5. (Optional) Cross-block smoothing per JPEG section K.8 or a similar
algorithm. (Steps 5-8 operate independently on each component.)
6. Inverse DCT transformation of each 8x8 block.
7. Upsampling. At this point a pixel image of the original dimensions
has been recreated.
8. Post-upsampling smoothing. This can be combined with upsampling,
by using a convolution-like calculation to generate each output pixel
directly from one or more input pixels.
9. Cropping to the original pixel dimensions (throwing away duplicated
pixels at the edges). It is most convenient to do this now, as the
preceding steps are simplified by not having to worry about odd picture
sizes.
10. Color space reconversion (e.g., YCbCr to RGB). This is a null step for
    grayscale. (Note that mapping a color JPEG to grayscale output is most
    easily done in this step.) Gamma adjustment may also be needed here.
    (This step is sketched in code at the end of this section.)
11. Color quantization (only if a colormapped output format is requested).
NOTE: it is probably preferable to perform quantization in the internal
(JPEG) colorspace rather than the output colorspace. Doing it that way,
color conversion need only be applied to the colormap entries, not to
every pixel; and quantization gets to operate in a non-gamma-corrected
space. But the internal space may not be suitable for some algorithms.
The system design is such that only the color quantizer module knows
whether color conversion happens before or after quantization.
12. Writing of the desired image format.
As before, some of these will be combined into single steps.
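As an illustration of step 10, the YCbCr-to-RGB equations used by JFIF are
simple enough to sketch directly. A real implementation would precompute
lookup tables rather than multiply per pixel, but the arithmetic is just:

    /* Convert one pixel from JFIF YCbCr to RGB.  For 8-bit samples, Cb
     * and Cr are centered on 128 and all results must be clamped to the
     * 0..255 sample range.
     */
    static int clamp(double x)
    {
      return x < 0.0 ? 0 : (x > 255.0 ? 255 : (int) (x + 0.5));
    }

    void ycc_to_rgb(int y, int cb, int cr, int *r, int *g, int *b)
    {
      *r = clamp(y + 1.40200 * (cr - 128));
      *g = clamp(y - 0.34414 * (cb - 128) - 0.71414 * (cr - 128));
      *b = clamp(y + 1.77200 * (cb - 128));
    }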